AITopics | lama ct

lama ct

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Benchmarking Generative Models on Computational Thinking Tests in Elementary Visual Programming

Neural Information Processing Systems, Feb-15-2026, 16:18:19 GMT

Generative models have demonstrated human-level proficiency in various benchmarks across domains like programming, natural sciences, and general knowledge.

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.67)

Industry:

Law (0.92)
Information Technology (0.92)
Government (0.67)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
(2 more...)

Benchmarking Generative Models on Computational Thinking Tests in Elementary Visual Programming

Neural Information Processing Systems, Oct-10-2025, 05:24:15 GMT

Generative models have demonstrated human-level proficiency in various benchmarks across domains like programming, natural sciences, and general knowledge.

avatar, dataset, grid, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.67)

Industry:

Law (0.92)
Information Technology (0.92)
Government (0.67)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
(2 more...)

Benchmarking Generative Models on Computational Thinking Tests in Elementary Visual Programming

Neural Information Processing Systems, Oct-10-2025, 05:24:11 GMT

Generative models have demonstrated human-level proficiency in various benchmarks across domains like programming, natural sciences, and general knowledge.

benchmark, lama ct, student, (14 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.72)

Benchmarking Generative Models on Computational Thinking Tests in Elementary Visual Programming

Pădurean, Victor-Alexandru, Singla, Adish

arXiv.org Artificial Intelligence, Jun-14-2024

Generative models have demonstrated human-level proficiency in various benchmarks across domains like programming, natural sciences, and general knowledge. Despite these promising results on competitive benchmarks, they still struggle with seemingly simple problem-solving tasks typically carried out by elementary-level students. How do state-of-the-art models perform on standardized tests designed to assess computational thinking and problem-solving skills at schools? In this paper, we curate a novel benchmark involving computational thinking tests grounded in elementary visual programming domains. Our initial results show that state-of-the-art models like GPT-4o and Llama3 barely match the performance of an average school student. To further boost the performance of these models, we fine-tune them using a novel synthetic data generation methodology. The key idea is to develop a comprehensive dataset using symbolic methods that capture different skill levels, ranging from recognition of visual elements to multi-choice quizzes to synthesis-style tasks. We showcase how various aspects of symbolic information in synthetic data help improve fine-tuned models' performance. We will release the full implementation and datasets to facilitate further research on enhancing computational thinking in generative models.

grid, lama ct, student, (13 more...)

arXiv.org Artificial Intelligence

2406.09891

Country: North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Consumer Health (0.34)
Education > Assessment & Standards > Student Performance (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.92)
